MESOSCOPIC FLUCTUATIONS OF THE ZETA ZEROS 



P. BOURGADE 

Abstract. We prove a multidimensional extension of Selberg's central 
limit theorem for $\log\zeta$, in which non-trivial correlations appear. In
particular, this answers a question by Coram and Diaconis about the 
mesoscopic fluctuations of the zeros of the Riemann zeta function. 

Similar results are given in the context of random matrices from 
the unitary group. This shows the correspondence $n \leftrightarrow \log t$ not only
between the dimension of the matrix and the height on the critical line, 
but also, in a local scale, for small deviations from the critical axis or 
the unit circle. 



Remark. All results below hold for L-functions from the Selberg class; for
concision we state them for $\zeta$.

In this paper we talk about correlations between random variables to
express the idea of dependence, which is equivalent as all the involved
variables are Gaussian.

The Vinogradov symbol $a_n \ll b_n$ means $a_n = O(b_n)$, and $a_n \gg b_n$ means
$b_n \ll a_n$. In this paper, we implicitly assume that, for all $n$ and $t$,
$\varepsilon_n \geq 0$, $\varepsilon_t \geq 0$.

1. Introduction 

1.1. Main result. Selberg's central limit theorem states that, if $\omega$ is
uniform on $(0,1)$, then

\[ \frac{\log\zeta\left(\frac12 + i\omega t\right)}{\sqrt{\log\log t}} \xrightarrow{\text{law}} Y \tag{1.1} \]

as $t \to \infty$, $Y$ being a standard complex normal variable (see paragraph 1.4
below for precise definitions of $\log\zeta$ and complex normal variables). This
result has been extended in two distinct directions, both relying on Selberg's
original method.

First, similar central limit theorems appear in Tsang's thesis [15] far away
from the critical axis, and Joyner [9] generalized these results to a larger
class of L-functions. In particular, (1.1) holds also for $\log\zeta$ evaluated close
to the critical axis ($1/2 + \varepsilon_t + i\omega t$) provided that $\varepsilon_t \ll 1/\log t$; for
$\varepsilon_t \to 0$ and $\varepsilon_t \gg 1/\log t$, Tsang proved that a change of normalization
is necessary:

\[ \frac{\log\zeta\left(\frac12 + \varepsilon_t + i\omega t\right)}{\sqrt{-\log\varepsilon_t}} \xrightarrow{\text{law}} Y' \tag{1.2} \]

with $\omega$ uniform on $(0,1)$ and $Y'$ a standard complex normal variable.

Second, a multidimensional extension of (1.1) was given by Hughes, Nikeghbali
and Yor [8], in order to get a dynamic analogue of Selberg's central limit
theorem: they showed that for any $0 < \lambda_1 < \dots < \lambda_\ell$,

\[ \frac{1}{\sqrt{\log\log t}} \left( \log\zeta\left(\tfrac12 + i\omega e^{(\log t)^{\lambda_1}}\right), \dots, \log\zeta\left(\tfrac12 + i\omega e^{(\log t)^{\lambda_\ell}}\right) \right) \xrightarrow{\text{law}} (\lambda_1 Y_1, \dots, \lambda_\ell Y_\ell), \tag{1.3} \]

all the $Y_k$'s being independent standard complex normal variables. The
evaluation points $\frac12 + i\omega e^{(\log t)^{\lambda_k}}$ in the above formula are very distant from
each other and a natural question is whether, for closer points, a non-trivial
correlation structure appears for the values of zeta. Actually, the average
values of $\log\zeta$ become correlated for small shifts, and the Gaussian kernel
appearing in the limit coincides with the one of Brownian motion off the
diagonal. More precisely, our main result is the following.



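Theorem 1.1. Let $\omega$ be uniform on $(0,1)$, $\varepsilon_t \to 0$, $\varepsilon_t \gg 1/\log t$, and
functions $0 \leq f_t^{(1)} < \dots < f_t^{(\ell)} < c < \infty$. Suppose that for all $i \neq j$

\[ \frac{\log\left|f_t^{(j)} - f_t^{(i)}\right|}{\log\varepsilon_t} \to c_{i,j} \in [0,\infty]. \tag{1.4} \]

Then the vector

\[ \frac{1}{\sqrt{-\log\varepsilon_t}} \left( \log\zeta\left(\tfrac12 + \varepsilon_t + i f_t^{(1)} + i\omega t\right), \dots, \log\zeta\left(\tfrac12 + \varepsilon_t + i f_t^{(\ell)} + i\omega t\right) \right) \tag{1.5} \]

converges in law to a complex Gaussian vector $(Y_1, \dots, Y_\ell)$ with mean 0 and
covariance function

\[ \mathrm{cov}(Y_i, Y_j) = \begin{cases} 1 & \text{if } i = j, \\ 1 \wedge c_{i,j} & \text{if } i \neq j. \end{cases} \tag{1.6} \]

Moreover, the above result remains true if $\varepsilon_t \ll 1/\log t$, replacing the
normalization $-\log\varepsilon_t$ with $\log\log t$ in (1.4) and (1.5).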

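The covariance structure (1.6) of the limit Gaussian vector actually depends
only on the $\ell - 1$ parameters $c_{1,2}, \dots, c_{\ell-1,\ell}$ because formula (1.4)
implies, for all $i < k < j$, $c_{i,j} = c_{i,k} \wedge c_{k,j}$. We will explicitly construct
Gaussian vectors with the correlation structure (1.6) in Section 4.

We now illustrate Theorem 1.1. Take $\ell = 2$, $\varepsilon_t \to 0$, $\varepsilon_t \gg 1/\log t$. Then
for any $0 \leq \delta \leq 1$ and $\omega$ uniform on $(0,1)$, choosing $f_t^{(1)} = 0$ and
$f_t^{(2)} = \varepsilon_t^{\delta}$,

\[ \frac{1}{\sqrt{-\log\varepsilon_t}} \left( \log\zeta\left(\tfrac12 + \varepsilon_t + i\omega t\right), \log\zeta\left(\tfrac12 + \varepsilon_t + i\omega t + i\varepsilon_t^{\delta}\right) \right) \]

converges in law to

\[ \left( \mathcal{N}_1, \ \delta\mathcal{N}_1 + \sqrt{1-\delta^2}\,\mathcal{N}_2 \right), \tag{1.7} \]

where $\mathcal{N}_1$ and $\mathcal{N}_2$ are independent standard complex normal variables (the
limit must be complex Gaussian by Theorem 1.1). A similar result holds if
$\varepsilon_t \ll 1/\log t$; in particular we have a central limit theorem on the
critical axis $\varepsilon_t = 0$:

\[ \frac{1}{\sqrt{\log\log t}} \left( \log\zeta\left(\tfrac12 + i\omega t\right), \log\zeta\left(\tfrac12 + i\omega t + \frac{i}{(\log t)^{\delta}}\right) \right) \]

also converges in law to (1.7). Note the change of normalization according to
$\varepsilon_t$, i.e. the distance to the critical axis. Finally, if all shifts $f_t^{(i)}$
are constant and distinct, $c_{i,j} = 0$ for all $i$ and $j$, so the distinct means of
$\zeta$ converge in law to independent complex normal variables, after normalization.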
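
To make the covariance structure (1.6) concrete, the following Python sketch assembles the matrix from the $\ell - 1$ consecutive parameters $c_{i,i+1}$, using $c_{i,j} = \min_{i \leq k < j} c_{k,k+1}$, and tests positive semi-definiteness by symmetric elimination. The helper names `covariance_matrix` and `is_psd` are illustrative assumptions (they do not come from the paper), and `float('inf')` is assumed to encode $c = \infty$.

```python
def covariance_matrix(c_consecutive):
    """Matrix (1.6) built from the consecutive parameters c_{i,i+1}.

    Hypothetical helper: c_{i,j} is taken as the minimum of c_{k,k+1}
    for i <= k < j, and cov(Y_i, Y_j) = min(1, c_{i,j}) off the diagonal.
    """
    ell = len(c_consecutive) + 1
    cov = [[1.0] * ell for _ in range(ell)]
    for i in range(ell):
        for j in range(i + 1, ell):
            c_ij = min(c_consecutive[i:j])
            cov[i][j] = cov[j][i] = min(1.0, c_ij)
    return cov


def is_psd(matrix, tol=1e-12):
    """Check positive semi-definiteness of a symmetric matrix.

    Uses successive Schur complements: every pivot must be >= 0, and a
    zero pivot forces the rest of its column to vanish.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]
    for k in range(n):
        if a[k][k] < -tol:
            return False
        if a[k][k] <= tol:
            if any(abs(a[i][k]) > tol for i in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j] / a[k][k]
    return True
```

For instance, `covariance_matrix([0.3, 0.7])` gives the $3 \times 3$ matrix with off-diagonal entries 0.3, 0.3 and 0.7, which the elimination test accepts.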

Remark. In this paper we are concerned with distinct shifts along the
ordinates, in particular because it implies the following Corollary 1.3 about
counting the zeros of the zeta function. The same method equally applies
to distinct shifts along the abscissa, not stated here for simplicity. For
example, the Gaussian variables $Y$ and $Y'$ in (1.1) and (1.2) have correlation
$\sqrt{1 \wedge \delta}$ if $\varepsilon_t = 1/(\log t)^{\delta}$ with $\delta > 0$.

Theorem 1.1 can be understood in terms of Gaussian processes: it has
the following immediate consequence, stated for $\varepsilon_t = 0$ for simplicity.

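Corollary 1.2. Let $\omega$ be uniform on $(0,1)$. Consider the random function

\[ \left( \frac{1}{\sqrt{\log\log t}} \log\zeta\left(\frac12 + i\omega t + \frac{i}{(\log t)^{\delta}}\right), \ 0 \leq \delta \leq 1 \right). \]

Then its finite dimensional distributions converge, as $t \to \infty$, to those of a
centered Gaussian process with kernel $\Gamma_{\gamma,\delta} = \gamma \wedge \delta$ if $\gamma \neq \delta$, $1$ if $\gamma = \delta$.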

There is an effective construction of a centered Gaussian process
$(X_\delta, 0 \leq \delta \leq 1)$ with covariance function $\Gamma_{\gamma,\delta}$: let $(B_\delta, 0 \leq \delta \leq 1)$ be a
standard Brownian motion and independently let $(D_\delta, 0 \leq \delta \leq 1)$ be a totally
disordered process, meaning that all its coordinates are independent centered
Gaussians with variance $\mathbb{E}(D_\delta^2) = \delta$. Then

\[ X_\delta = B_\delta + D_{1-\delta} \]

defines a Gaussian process with the desired covariance function. Note that
there is no measurable version of this process: if there were, then
$(D_\delta, 0 \leq \delta \leq 1)$ would have a measurable version, which is absurd because, by
Fubini's Theorem, for all $0 \leq a < b \leq 1$,

\[ \mathbb{E}\left( \left( \int_a^b D_\delta \, d\delta \right)^2 \right) = 0, \quad \text{so} \quad \int_a^b D_\delta \, d\delta = 0 \ \text{a.s.} \]

and $D_\delta = 0$ a.s., giving the contradiction.
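
The covariance of $X_\delta = B_\delta + D_{1-\delta}$ can be checked term by term: the Brownian part contributes $\gamma \wedge \delta$ and the disordered part contributes $1 - \delta$ on the diagonal only. The short Python sketch below encodes this bookkeeping; the function names are hypothetical and chosen only for this illustration.

```python
def brownian_cov(gamma, delta):
    """Covariance of standard Brownian motion: E(B_gamma B_delta)."""
    return min(gamma, delta)


def disorder_cov(gamma, delta):
    """Covariance of D_{1-gamma} and D_{1-delta}.

    Coordinates of D are independent with Var(D_s) = s, so only the
    diagonal gamma == delta contributes, with value 1 - delta.
    """
    return (1.0 - delta) if gamma == delta else 0.0


def kernel_from_construction(gamma, delta):
    """Covariance of X = B + D_{1 - .}, using independence of B and D."""
    return brownian_cov(gamma, delta) + disorder_cov(gamma, delta)


def kernel_gamma(gamma, delta):
    """Kernel of Corollary 1.2: min(gamma, delta) off the diagonal, 1 on it."""
    return 1.0 if gamma == delta else min(gamma, delta)
```

On the diagonal the two contributions add up to $\delta + (1 - \delta) = 1$, which is exactly the jump of the kernel at $\gamma = \delta$.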

1.2. Counting the zeros. Theorem 1.1 also has a strange consequence for
the counting of zeros of $\zeta$ on intervals in the critical strip. Write $N(t)$ for
the number of non-trivial zeros $z$ of $\zeta$ with $0 < \Im z \leq t$, counted with their
multiplicity. Then (see e.g. Theorem 9.3 in Titchmarsh [14])

\[ N(t) = \frac{t}{2\pi} \log\frac{t}{2\pi e} + \frac{1}{\pi} \Im\log\zeta(1/2 + it) + \frac{7}{8} + O\left(\frac{1}{t}\right) \tag{1.8} \]

with $\Im\log\zeta(1/2 + it) = O(\log t)$. For $t_1 < t_2$ we will write

\[ \Delta(t_1, t_2) = \left( N(t_2) - N(t_1) \right) - \left( \frac{t_2}{2\pi} \log\frac{t_2}{2\pi e} - \frac{t_1}{2\pi} \log\frac{t_1}{2\pi e} \right), \]

which represents the fluctuations of the number of zeros $z$ ($t_1 < \Im z \leq t_2$)
minus its expectation. A direct consequence of Theorem 1.1, choosing $\ell = 2$,
$f^{(1)}(t) = 0$ and $f^{(2)}(t) = 1/(\log t)^{\delta}$ ($0 \leq \delta \leq 1$), is the following central limit
theorem obtained by Fujii [4]:

\[ \frac{\Delta\left(\omega t, \omega t + \frac{1}{(\log t)^{\delta}}\right)}{\frac{1}{\pi}\sqrt{(1-\delta)\log\log t}} \xrightarrow{\text{law}} \mathcal{N} \]

as $t \to \infty$, where $\omega$ is uniform on $(0,1)$ and $\mathcal{N}$ is a standard real normal
variable. A more general result actually holds, being a direct consequence of
Theorem 1.1 and (1.8). This confirms numerical experiments by Coram and
Diaconis [1], who after making extensive tests (based on data by Odlyzko)
suggested that the correlation structure (1.9) below should appear when
counting the zeros of $\zeta$. Following [1], the phenomenon presented below can
be seen as the mesoscopic repulsion of the zeta zeros, different from the
Montgomery-Odlyzko law, which describes the repulsion at a microscopic scale.

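Corollary 1.3. Let $(K_t)$ be such that, for some $\varepsilon > 0$ and all $t$, $K_t > \varepsilon$.
Suppose $\log K_t / \log\log t \to \delta \in [0,1)$ as $t \to \infty$. Then the finite dimensional
distributions of the process

\[ \frac{\Delta\left(\omega t + \alpha/K_t, \ \omega t + \beta/K_t\right)}{\frac{1}{\pi}\sqrt{(1-\delta)\log\log t}}, \quad 0 \leq \alpha < \beta < \infty \]

converge to those of a centered Gaussian process $(\tilde\Delta(\alpha,\beta), 0 \leq \alpha < \beta < \infty)$
with the covariance structure

\[ \mathbb{E}\left( \tilde\Delta(\alpha,\beta) \tilde\Delta(\alpha',\beta') \right) = \begin{cases} 1 & \text{if } \alpha = \alpha' \text{ and } \beta = \beta' \\ 1/2 & \text{if } \alpha = \alpha' \text{ and } \beta \neq \beta' \\ 1/2 & \text{if } \alpha \neq \alpha' \text{ and } \beta = \beta' \\ -1/2 & \text{if } \beta = \alpha' \\ 0 & \text{elsewhere} \end{cases} \tag{1.9} \]

This correlation structure is surprising: for example $\tilde\Delta(\alpha,\beta)$ and $\tilde\Delta(\alpha',\beta')$
are independent if the segment $[\alpha,\beta]$ is strictly included in $[\alpha',\beta']$, and
positively correlated if this inclusion is not strict. Note that there is again
an effective construction of $\tilde\Delta$: if $(\tilde D_\delta, \delta \geq 0)$ is a real valued process with
all coordinates independent centered Gaussians with variance $\mathbb{E}(\tilde D_\delta^2) = 1/2$,
then

\[ \tilde\Delta(\alpha,\beta) = \tilde D_\beta - \tilde D_\alpha \]

has the required correlation structure. Concerning the discovery of this
exotic Gaussian correlation function in the context of unitary matrices, see
the remark after Theorem 1.4.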
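
The covariance table (1.9) can be recovered from a totally disordered process with variance 1/2 by bilinearity, taking increments $\tilde D_\beta - \tilde D_\alpha$. The Python sketch below does this bookkeeping exactly; the function names `d_cov` and `delta_cov` are assumptions made for this illustration.

```python
def d_cov(x, y):
    """E(D_x D_y) for independent centered coordinates of variance 1/2."""
    return 0.5 if x == y else 0.0


def delta_cov(alpha, beta, alpha2, beta2):
    """E[(D_beta - D_alpha)(D_beta2 - D_alpha2)], expanded by bilinearity."""
    return (d_cov(beta, beta2) - d_cov(beta, alpha2)
            - d_cov(alpha, beta2) + d_cov(alpha, alpha2))
```

In particular `delta_cov(1, 2, 0, 3)` vanishes, which is the independence of the counts on strictly nested intervals noted above.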

1.3. Analogous result on random matrices. We denote by $Z(u_n, X)$ the
characteristic polynomial of a matrix $u_n \in U(n)$, and often abbreviate it
as $Z$. Theorem 1.1 was inspired by the following analogue (Theorem 1.4) in
random matrix theory. This confirms the validity of the correspondence

\[ n \leftrightarrow \log t \]

between the dimension of random matrices and the length of integration
on the critical axis, but it also supports this analogy at a local scale, for
the evaluation points of $\log Z$ and $\log\zeta$: the necessary shifts are strictly
analogous both for the abscissa/radius ($\varepsilon_n \leftrightarrow \varepsilon_t$) and the ordinate/angle
($f_t^{(i)} \leftrightarrow \varphi_n^{(i)}$).

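Theorem 1.4. Let $u_n \sim \mu_{U(n)}$, $\varepsilon_n \to 0$, $\varepsilon_n \gg 1/n$, and functions
$0 \leq \varphi_n^{(1)} < \dots < \varphi_n^{(\ell)} < 2\pi - \delta$ for some $\delta > 0$. Suppose that for all $i \neq j$

\[ \frac{\log\left|\varphi_n^{(j)} - \varphi_n^{(i)}\right|}{\log\varepsilon_n} \to c_{i,j} \in [0,\infty]. \tag{1.10} \]

Then the vector

\[ \frac{1}{\sqrt{-\log\varepsilon_n}} \left( \log Z\left(u_n, e^{\varepsilon_n + i\varphi_n^{(1)}}\right), \dots, \log Z\left(u_n, e^{\varepsilon_n + i\varphi_n^{(\ell)}}\right) \right) \tag{1.11} \]

converges in law to a complex Gaussian vector with mean 0 and covariance
function (1.6). Moreover, the above result remains true if $\varepsilon_n \ll 1/n$,
replacing the normalization $-\log\varepsilon_n$ with $\log n$ in (1.10) and (1.11).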

Remark. Let $N_n(\alpha,\beta)$ be the number of eigenvalues $e^{i\theta}$ of $u_n$ with $\alpha < \theta < \beta$,
and $\delta_n(\alpha,\beta) = N_n(\alpha,\beta) - \mathbb{E}_{\mu_{U(n)}}(N_n(\alpha,\beta))$. Then, a little calculation (see
[7]) yields

\[ \delta_n(\alpha,\beta) = \frac{1}{\pi} \left( \Im\log Z(u_n, e^{i\beta}) - \Im\log Z(u_n, e^{i\alpha}) \right). \]

This and the above theorem imply that, as $n \to \infty$, the vector

\[ \frac{1}{\sqrt{\log n}} \left( \delta_n(\varphi_n^{(1)}, \varphi_n^{(2)}), \delta_n(\varphi_n^{(2)}, \varphi_n^{(3)}), \dots, \delta_n(\varphi_n^{(\ell-1)}, \varphi_n^{(\ell)}) \right) \]

converges in law to a Gaussian limit. Central limit theorems for the
counting-number of eigenvalues in intervals were discovered by Wieand [16] in the
special case when all the intervals have a fixed length independent of $n$
(included in the case $c_{i,j} = 0$ for all $i, j$). Her result was extended by Diaconis
and Evans to the case $\varphi_n^{(i)} = \varphi^{(i)}/K_n$ for some $K_n \to \infty$, $K_n/n \to 0$ (i.e. $c_{i,j}$
is a constant independent of $i$ and $j$): Corollary 1.3 is a number-theoretic
analogue of their Theorem 6.1 in [2].

Note that, in the general case of distinct $c_{i,i+1}$'s, a similar result holds
but the correlation function of the limit vector is not as simple as the one in
Corollary 1.3: it strongly depends on the relative orders of these coefficients
$c_{i,i+1}$'s.

1.4. Definitions, organization of the paper. In this paper, for more 
concision we will make use of the following standard definition of complex 
Gaussian random variables. 

Definition 1.5. A complex standard normal random variable $Y$ is defined as
$\frac{1}{\sqrt2}(\mathcal{N}_1 + i\mathcal{N}_2)$, $\mathcal{N}_1$ and $\mathcal{N}_2$ being independent real standard normal variables.
For any $\lambda, \mu \in \mathbb{C}$, we will say that $\lambda + \mu Y$ is a complex normal variable
with mean $\lambda$ and variance $|\mu|^2$. The covariance of two complex Gaussian
variables $Y$ and $Y'$ is defined as $\mathrm{cov}(Y,Y') = \mathbb{E}(\overline{Y} Y') - \mathbb{E}(\overline{Y})\mathbb{E}(Y')$, and
$\mathrm{Var}(Y) = \mathrm{cov}(Y,Y)$.

A vector $(Y_1, \dots, Y_\ell)$ is a complex Gaussian vector if any linear
combination of its coordinates is a complex normal variable. For such a complex
Gaussian vector and any $\mu = (\mu_1, \dots, \mu_\ell) \in \mathbb{C}^\ell$, $\sum_{k=1}^{\ell} \mu_k Y_k$ has variance
$\overline{\mu}\, C\, {}^t\mu$, where $C$ is said to be the covariance matrix of $(Y_1, \dots, Y_\ell)$:
$C_{i,j} = \mathrm{cov}(Y_i, Y_j)$.

As in the real case, the mean and the covariance matrix characterize a
complex Gaussian vector.
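
As a small worked instance of Definition 1.5, the variance of a linear combination $\sum_k \mu_k Y_k$ is the Hermitian form $\sum_{j,k} \overline{\mu_j}\mu_k C_{j,k}$. The Python sketch below evaluates it directly; the name `combination_variance` is a hypothetical helper, and the conjugate is placed on the first index to match the convention $\mathrm{cov}(Y,Y') = \mathbb{E}(\overline{Y}Y')$ assumed above.

```python
def combination_variance(mu, cov):
    """Variance of sum_k mu[k] * Y_k for a covariance matrix cov.

    Computes conj(mu) C mu^T, i.e. the double sum of
    conj(mu_j) * mu_k * C[j][k].
    """
    n = len(mu)
    return sum(
        complex(mu[j]).conjugate() * complex(mu[k]) * cov[j][k]
        for j in range(n)
        for k in range(n)
    )
```

With $\ell = 2$, $\mu = (1, 1)$ and off-diagonal covariance $1/2$, the result is $1 + 1 + 2 \cdot \tfrac12 = 3$, in agreement with a variance of the form $\sum_j |\mu_j|^2 + \sum_{s \neq t} \overline{\mu_s}\mu_t (c_{s,t} \wedge 1)$.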

Moreover, precise definitions of $\log\zeta$ and $\log Z(X)$ are necessary: for
$\sigma > 1/2$, we use the standard definition

\[ \log\zeta(\sigma + it) = -\int_\sigma^\infty \frac{\zeta'}{\zeta}(s + it) \, ds \]

if $\zeta$ has no zero with ordinate $t$. Otherwise, $\log\zeta(\sigma + it) = \lim_{\varepsilon \to 0} \log\zeta(\sigma + i(t + \varepsilon))$.

Similarly, let $u \sim \mu_{U(n)}$ have eigenvalues $e^{i\theta_1}, \dots, e^{i\theta_n}$. For $|X| > 1$, the
principal branch of the logarithm of $Z(X) = \det(\mathrm{Id} - X^{-1}u)$ is chosen as

\[ \log Z(X) = \sum_{k=1}^{n} \log\left(1 - \frac{e^{i\theta_k}}{X}\right) = -\sum_{j=1}^{\infty} \frac{1}{j} \frac{\mathrm{Tr}(u^j)}{X^j}. \]

Following Diaconis and Evans [2], if $X_n \to X$ with $|X_n| > 1$ and $|X| = 1$,
then $\log Z(X_n)$ converges in $L^2$ to $-\sum_{j=1}^{\infty} \frac{1}{j} \frac{\mathrm{Tr}(u^j)}{X^j}$; therefore this is our
definition of $\log Z(X)$ when $|X| = 1$.

We will successively prove Theorems 1.4 and 1.1 in the next two sections.
They are independent, but we feel that the joint central limit theorem for $\zeta$
and its analogue for the random matrices are better understood by comparing
both proofs, which are similar. In particular Proposition 3.1, which is
a major step towards Theorem 1.1, is a strict number-theoretic analogue of
the Diaconis-Evans theorem used in the next section to prove Theorem 1.4.

Finally, in Section 4, we show that the same correlation structure as (1.6)
appears in the theory of spatial branching processes.

2. The central limit theorem for random matrices. 

2.1. The Diaconis-Evans method. Diaconis and Shahshahani [3] looked
at the joint moments of $\mathrm{Tr}\,u, \mathrm{Tr}\,u^2, \dots, \mathrm{Tr}\,u^\ell$ for $u \sim \mu_{U(n)}$, and showed
that any of these moments coincides with the ones of

\[ Y_1, \sqrt{2}\,Y_2, \dots, \sqrt{\ell}\,Y_\ell \]

for sufficiently large $n$, the $Y_k$'s being independent standard complex normal
variables. This suggests that under general assumptions, a central limit
theorem can be stated for linear combinations of these traces.

Indeed, the main tool we will use for the proof of Theorem 1.4 is the 
following result. 

Theorem 2.1 (Diaconis, Evans [2]). Consider an array of complex constants
$\{a_{nj} \mid n \in \mathbb{N}, j \in \mathbb{N}\}$. Suppose there exists $\sigma^2$ such that

\[ \lim_{n\to\infty} \sum_{j=1}^{\infty} |a_{nj}|^2 (j \wedge n) = \sigma^2. \tag{2.1} \]

Suppose also that there exists a sequence of positive integers $\{m_n \mid n \in \mathbb{N}\}$
such that $\lim_{n\to\infty} m_n/n = 0$ and

\[ \lim_{n\to\infty} \sum_{j=m_n+1}^{\infty} |a_{nj}|^2 (j \wedge n) = 0. \tag{2.2} \]

Then $\sum_{j=1}^{\infty} a_{nj} \mathrm{Tr}\,u_n^j$ converges in distribution to $\sigma Y$, where $Y$ is a complex
standard normal random variable and $u_n \sim \mu_{U(n)}$.

Thanks to the above result, to prove central limit theorems for class
functions, we only need to decompose them on the basis of the traces of successive
powers. This is the method employed in the next subsections, where we treat
separately the cases $\varepsilon_n \gg 1/n$ and $\varepsilon_n \ll 1/n$.
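
The decomposition on traces that this method relies on is, for the characteristic polynomial, the expansion $\log Z(X) = -\sum_{j \geq 1} \mathrm{Tr}(u^j)/(jX^j)$ for $|X| > 1$ from paragraph 1.4. The Python sketch below compares it numerically with the direct sum over eigenvalues. It is only an illustration under stated assumptions: the unitary matrix is represented by a list of eigenangles (so that $\mathrm{Tr}(u^j) = \sum_k e^{ij\theta_k}$), the series is truncated at a hypothetical cutoff `terms`, and the function names are not from the paper.

```python
import cmath


def log_z_direct(thetas, x):
    """log Z(X) as the sum of principal logarithms log(1 - e^{i theta_k}/X)."""
    return sum(cmath.log(1 - cmath.exp(1j * th) / x) for th in thetas)


def log_z_traces(thetas, x, terms=200):
    """Truncated series -sum_{j<=terms} Tr(u^j) / (j X^j).

    The trace of u^j is computed from the eigenangles of u.
    """
    total = 0j
    for j in range(1, terms + 1):
        trace = sum(cmath.exp(1j * j * th) for th in thetas)
        total -= trace / (j * x ** j)
    return total
```

For $|X| > 1$ each term $e^{i\theta_k}/X$ has modulus below 1, so $1 - e^{i\theta_k}/X$ has positive real part and the principal logarithm agrees with the power series; the truncation error is geometric in $1/|X|$.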

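2.2. Proof of Theorem 1.4 for $\varepsilon_n \gg 1/n$. From the Cramér-Wold device [1],
a sufficient condition to prove Theorem 1.4 is that, for any $(\mu_1, \dots, \mu_\ell) \in \mathbb{C}^\ell$,

\[ \frac{1}{\sqrt{-\log\varepsilon_n}} \sum_{k=1}^{\ell} \mu_k \log Z\left(u_n, e^{\varepsilon_n + i\varphi_n^{(k)}}\right) \]

converges in law to a complex normal variable with mean 0 and variance

\[ \sigma^2 = \sum_{i=1}^{\ell} |\mu_i|^2 + \sum_{s \neq t} \overline{\mu_s}\mu_t (c_{s,t} \wedge 1). \tag{2.3} \]

Footnote [1]: A Borel probability measure $\mu$ on $\mathbb{R}^\ell$ is uniquely determined by the
family of its one-dimensional projections, that is the images of $\mu$ by
$(x_1, \dots, x_\ell) \mapsto \sum_{j=1}^{\ell} \lambda_j x_j$, for any vector $(\lambda_j)_{1 \leq j \leq \ell} \in \mathbb{R}^\ell$.

We need to check conditions (2.1) and (2.2) from Theorem 2.1, with

\[ a_{nj} = \frac{-1}{\sqrt{-\log\varepsilon_n}} \sum_{k=1}^{\ell} \frac{\mu_k}{j\, e^{j(\varepsilon_n + i\varphi_n^{(k)})}}. \]

First, to calculate the limit of

\[ \sum_{j=1}^{\infty} |a_{nj}|^2 (j \wedge n) = \sum_{j=1}^{n} j |a_{nj}|^2 + n \sum_{j=n+1}^{\infty} |a_{nj}|^2, \]

note that this second term tends to 0: if $a = \left(\sum_{k=1}^{\ell} |\mu_k|\right)^2$, then

\[ (-\log\varepsilon_n)\, n \sum_{j=n+1}^{\infty} |a_{nj}|^2 \leq a\, n \sum_{j=n+1}^{\infty} \frac{1}{j^2} \leq a, \]

so $n \sum_{j=n+1}^{\infty} |a_{nj}|^2 \to 0$. The first term can be written

\[ (-\log\varepsilon_n) \sum_{j=1}^{n} j |a_{nj}|^2 = \sum_{s,t} \overline{\mu_s}\mu_t \sum_{j=1}^{n} \frac{1}{j} \left( \frac{e^{i(\varphi_n^{(s)} - \varphi_n^{(t)})}}{e^{2\varepsilon_n}} \right)^j. \]

Hence the expected limit is a consequence of the following lemma.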

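Lemma 2.2. Let $\varepsilon_n \gg 1/n$, $\varepsilon_n \to 0$, $(\Delta_n)$ be a strictly positive sequence,
bounded by $2\pi - \delta$ for some $\delta > 0$, and $\log\Delta_n / \log\varepsilon_n \to c \in [0,\infty]$. Then

\[ \frac{1}{-\log\varepsilon_n} \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j\, e^{2j\varepsilon_n}} \xrightarrow[n\to\infty]{} c \wedge 1. \]

Proof. The Taylor expansion of $\log(1 - X)$ for $|X| < 1$ gives

\[ \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j\, e^{2j\varepsilon_n}} = -\underbrace{\log\left(1 - e^{-2\varepsilon_n + i\Delta_n}\right)}_{(1)} - \underbrace{\sum_{j=n+1}^{\infty} \frac{e^{ij\Delta_n}}{j\, e^{2j\varepsilon_n}}}_{(2)}. \]

As $\varepsilon_n > d/n$ for some constant $d > 0$,

\[ |(2)| \leq \sum_{j=n+1}^{\infty} \frac{1}{j\, e^{2j\varepsilon_n}} \leq \sum_{j=n+1}^{\infty} \frac{1}{j\, e^{2jd/n}} \leq \int_0^{\infty} \frac{dx}{(1+x)\, e^{2d(1+x)}} < \infty, \]

so (2), divided by $\log\varepsilon_n$, tends to 0.

We now look at the main contribution, coming from (1). If $c > 1$, then
$\Delta_n = o(\varepsilon_n)$, so (1) is equivalent to $\log\varepsilon_n$ as $n \to \infty$. If $0 < c < 1$, then
$\varepsilon_n = o(\Delta_n)$ so (1) is equivalent to $\log\Delta_n$, hence to $c\log\varepsilon_n$. If $c = 1$, (1) is
equivalent to $(\log\varepsilon_n)\mathbf{1}_{\varepsilon_n > \Delta_n} + (\log\Delta_n)\mathbf{1}_{\Delta_n > \varepsilon_n}$, that is to say $\log\varepsilon_n$. Finally,
if $c = 0$, as $(\varepsilon_n)^a < \Delta_n < 2\pi - \delta$ for all $a > 0$ and $n$ large enough, (1) $= o(\log\varepsilon_n)$. □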

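The condition (2.2) in Theorem 2.1 remains to be shown. Since we
have already shown that $n \sum_{j=n+1}^{\infty} |a_{nj}|^2 \to 0$, we look for a sequence
$(m_n)$ with $m_n/n \to 0$ and $\sum_{j=m_n+1}^{n} j|a_{nj}|^2 \to 0$. Writing as previously
$a = \left(\sum_{k=1}^{\ell} |\mu_k|\right)^2$, then

\[ \sum_{j=m_n+1}^{n} j |a_{nj}|^2 \leq \frac{a}{-\log\varepsilon_n} \sum_{j=m_n+1}^{n} \frac{1}{j}. \]

Hence any sequence $(m_n)$ with $m_n = o(n)$, $(\log n - \log(m_n))/\log\varepsilon_n \to 0$ is
convenient, for example $m_n = \lfloor n/(-\log\varepsilon_n) \rfloor$.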

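2.3. Proof of Theorem 1.4 for $\varepsilon_n \ll 1/n$. We now need to check
conditions (2.1) and (2.2) with

\[ a_{nj} = \frac{-1}{\sqrt{\log n}} \sum_{k=1}^{\ell} \frac{\mu_k}{j\, e^{j(\varepsilon_n + i\varphi_n^{(k)})}} \]

and $\sigma^2$ as in (2.3). In the same way as the previous paragraph,
$n \sum_{j=n+1}^{\infty} |a_{nj}|^2 \to 0$, and (2.2) holds with $m_n = \lfloor n/\log n \rfloor$. So the last thing
to prove is

\[ \sum_{j=1}^{n} j |a_{nj}|^2 = \sum_{s,t} \overline{\mu_s}\mu_t \, \frac{1}{\log n} \sum_{j=1}^{n} \frac{1}{j} \left( \frac{e^{i(\varphi_n^{(s)} - \varphi_n^{(t)})}}{e^{2\varepsilon_n}} \right)^j \xrightarrow[n\to\infty]{} \sigma^2, \]

that is to say, writing $x_n = e^{-2\varepsilon_n + i(\varphi_n^{(s)} - \varphi_n^{(t)})}$,

\[ \frac{1}{\log n} \sum_{j=1}^{n} \frac{x_n^j}{j} \xrightarrow[n\to\infty]{} c_{s,t} \wedge 1. \]

First note that with no restriction we can suppose $\varepsilon_n = 0$. Indeed, if we
write $y_n = e^{i(\varphi_n^{(s)} - \varphi_n^{(t)})}$, and $\varepsilon_n < b/n$ for some $b > 0$ (since $\varepsilon_n \ll 1/n$),

\[ \left| \sum_{j=1}^{n} \frac{x_n^j}{j} - \sum_{j=1}^{n} \frac{y_n^j}{j} \right| \leq \sum_{j=1}^{n} \frac{1}{j} \left| e^{-2j\varepsilon_n} - 1 \right| \leq \sum_{j=1}^{n} 2\varepsilon_n \leq 2b \]

because $|e^{-x} - 1| \leq x$ for $x \geq 0$. The asymptotics of $\sum_{j=1}^{n} y_n^j / j$ are given in
the next lemma, which concludes the proof.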

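Lemma 2.3. Let $(\Delta_n)$ be a strictly positive sequence, bounded by $2\pi - \delta$ for
some $\delta > 0$, such that $-\log\Delta_n / \log n \to c \in [0,\infty]$. Then

\[ \frac{1}{\log n} \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j} \xrightarrow[n\to\infty]{} c \wedge 1. \]

Proof. We successively treat the cases $c > 0$ and $c = 0$. Suppose first that
$c > 0$. By comparison between the Riemann sum and the corresponding
integral,

\[ \left| \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j} - \int_{\Delta_n}^{(n+1)\Delta_n} \frac{e^{it}}{t} \, dt \right| \leq \sum_{j=1}^{n} \int_{j\Delta_n}^{(j+1)\Delta_n} \frac{\left| e^{ij\Delta_n} - e^{it} \right|}{j\Delta_n} \, dt + \sum_{j=1}^{n} \int_{j\Delta_n}^{(j+1)\Delta_n} \left( \frac{1}{j\Delta_n} - \frac{1}{t} \right) dt \]
\[ \leq \sum_{j=1}^{n} \frac{\Delta_n}{j} + \sum_{j=1}^{n} \left( \frac{1}{j} - \frac{1}{j+1} \right) \leq \Delta_n (\log n + 1) + 1. \]

As $c > 0$, $\Delta_n \to 0$, so $\frac{1}{\log n} \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j}$ has the same limit as
$\frac{1}{\log n} \int_{\Delta_n}^{(n+1)\Delta_n} \frac{e^{it}}{t} \, dt$ as $n \to \infty$. If $c > 1$, $n\Delta_n \to 0$ so we easily get

\[ \frac{1}{\log n} \int_{\Delta_n}^{(n+1)\Delta_n} \frac{e^{it}}{t} \, dt \underset{n\to\infty}{\sim} \frac{1}{\log n} \int_{\Delta_n}^{(n+1)\Delta_n} \frac{dt}{t} = \frac{\log(n+1)}{\log n} \xrightarrow[n\to\infty]{} 1. \]

If $0 < c < 1$, $n\Delta_n \to \infty$. As $\sup_{x > 1} \left| \int_1^x \frac{e^{it}}{t} \, dt \right| < \infty$,

\[ \frac{1}{\log n} \int_{\Delta_n}^{(n+1)\Delta_n} \frac{e^{it}}{t} \, dt \underset{n\to\infty}{\sim} \frac{1}{\log n} \int_{\Delta_n}^{1} \frac{e^{it}}{t} \, dt \underset{n\to\infty}{\sim} \frac{1}{\log n} \int_{\Delta_n}^{1} \frac{dt}{t} \xrightarrow[n\to\infty]{} c. \]

If $c = 1$, a distinction between the cases $n\Delta_n \leq 1$, $n\Delta_n > 1$ and the above
reasoning gives 1 in the limit.

If $c = 0$, $\Delta_n$ does not necessarily converge to 0 anymore so another method
is required. An elementary summation gives

\[ \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j} = \sum_{k=1}^{n-1} \left( \frac{1}{k} - \frac{1}{k+1} \right) \sum_{j=1}^{k} e^{ij\Delta_n} + \frac{1}{n} \sum_{j=1}^{n} e^{ij\Delta_n}. \]

We will choose a sequence $(a_n)$ ($1 \leq a_n \leq n$) and bound $\sum_{j=1}^{k} e^{ij\Delta_n}$ by $k$ if
$k < a_n$, by $\left| (e^{ik\Delta_n} - 1)/(e^{i\Delta_n} - 1) \right| \leq 2/|e^{i\Delta_n} - 1|$ if $a_n \leq k \leq n$. This yields

\[ \left| \sum_{j=1}^{n} \frac{e^{ij\Delta_n}}{j} \right| \leq \sum_{k=1}^{a_n - 1} \frac{1}{k+1} + \frac{2}{|e^{i\Delta_n} - 1|} \left( \sum_{k=a_n}^{n-1} \left( \frac{1}{k} - \frac{1}{k+1} \right) + \frac{1}{n} \right) \leq \log a_n + \frac{2}{a_n |e^{i\Delta_n} - 1|}. \]

As $\Delta_n < 2\pi - \delta$, there is a constant $\lambda > 0$ with $|e^{i\Delta_n} - 1| > \lambda\Delta_n$. So
the result follows if we can find a sequence $(a_n)$ such that $\frac{\log a_n}{\log n} \to 0$ and
$a_n \Delta_n \log n \to \infty$, which is true for $a_n = \lfloor 2\pi/\Delta_n \rfloor$. □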
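
Lemma 2.3 can be explored numerically by taking $\Delta_n = n^{-c}$, for which $-\log\Delta_n/\log n = c$ exactly. The Python sketch below evaluates the normalized sum for a given $n$ and $\Delta_n$; the function name `normalized_sum` is an assumption made for this illustration. The normalization is only logarithmic, so at accessible values of $n$ the computed values still differ visibly from the limit $c \wedge 1$.

```python
import cmath
import math


def normalized_sum(n, delta):
    """(1 / log n) * sum_{j=1}^{n} exp(i j delta) / j."""
    total = sum(cmath.exp(1j * j * delta) / j for j in range(1, n + 1))
    return total / math.log(n)


def lemma_limit(c):
    """Limit c ∧ 1 predicted by Lemma 2.3."""
    return min(c, 1.0)
```

For example, with $n = 10^5$ one can compare `normalized_sum(n, n ** -0.5)` with `lemma_limit(0.5)`, and `normalized_sum(n, n ** -2.0)` with `lemma_limit(2.0)`.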

3. The central limit theorem for $\zeta$

3.1. Selberg's method. Suppose the Euler product of $\zeta$ holds for
$1/2 \leq \Re(s) \leq 1$ (this is a conjecture): then $\log\zeta(s) = -\sum_{p \in \mathcal{P}} \log(1 - p^{-s})$ can be
approximated by $\sum_{p \in \mathcal{P}} p^{-s}$. Let $s = 1/2 + \varepsilon_t + i\omega t$ with $\omega$ uniform on $(0,1)$.
As the $\log p$'s are linearly independent over $\mathbb{Q}$, the terms $\{p^{-i\omega t} \mid p \in \mathcal{P}\}$
can be viewed as independent uniform random variables on the unit circle
as $t \to \infty$, hence it was a natural thought that a central limit theorem might
hold for $\log\zeta(s)$, which was indeed shown by Selberg [12].

The crucial point to get such arithmetical central limit theorems is the
approximation by sufficiently short Dirichlet series. Selberg's ideas to
approximate $\log\zeta$ appear in Goldston [6], Joyner [9], Tsang [15] or Selberg's
original paper [12]. More precisely, the explicit formula for $\zeta'/\zeta$, by Landau,
gives such an approximation ($x > 1$, $s$ distinct from 1, the zeros $\rho$ and $-2n$,
$n \in \mathbb{N}$):

\[ \frac{\zeta'}{\zeta}(s) = -\sum_{n \leq x} \frac{\Lambda(n)}{n^s} + \frac{x^{1-s}}{1-s} - \sum_{\rho} \frac{x^{\rho - s}}{\rho - s} + \sum_{n=1}^{\infty} \frac{x^{-2n-s}}{2n + s}, \]

from which we get an approximate formula for $\log\zeta$ by integration. However,
the sum over the zeros is not absolutely convergent, hence this formula
is not sufficient. Selberg found a slight change in the above formula, that
makes a great difference because all infinite sums are now absolutely
convergent: under the above hypotheses, if

\[ \Lambda_x(n) = \begin{cases} \Lambda(n) & \text{for } 1 \leq n \leq x, \\ \Lambda(n) \frac{\log(x^2/n)}{\log x} & \text{for } x \leq n \leq x^2, \end{cases} \]

then

\[ \frac{\zeta'}{\zeta}(s) = -\sum_{n \leq x^2} \frac{\Lambda_x(n)}{n^s} + \frac{x^{2(1-s)} - x^{1-s}}{(1-s)^2 \log x} + \frac{1}{\log x} \sum_{\rho} \frac{x^{\rho - s} - x^{2(\rho - s)}}{(\rho - s)^2} + \frac{1}{\log x} \sum_{n=1}^{\infty} \frac{x^{-2n-s} - x^{-2(2n+s)}}{(2n+s)^2}. \]

Assuming the Riemann hypothesis, the above formulas give a simple expression
for $(\zeta'/\zeta)(s)$ for $\Re(s) > 1/2$: for $x \to \infty$, all terms in the infinite sums
converge to 0 because $\Re(\rho - s) < 0$. By subtle arguments, Selberg showed
that, although RH is necessary for the almost sure coincidence between $\zeta'/\zeta$
and its Dirichlet series, it is not required in order to get a good
approximation. In particular, Selberg [12] (see also Joyner [9] for similar results for
more general L-functions) proved that for any $k \in \mathbb{N}^*$, $0 < a < 1$, there is a
constant $c_{k,a}$ such that for any $1/2 \leq \sigma \leq 1$, $t^{a/k} \leq x \leq t^{1/k}$,

\[ \frac{1}{t} \int_1^t \left| \log\zeta(\sigma + is) - \sum_{p \leq x} \frac{p^{-is}}{p^{\sigma}} \right|^{2k} ds \leq c_{k,a}. \]

In the following, we only need the case $k = 1$ in the above formula: with
the notations of Theorem 1.1 ($\omega$ uniform on $(0,1)$),

\[ \log\zeta\left(\tfrac12 + \varepsilon_t + i f_t^{(j)} + i\omega t\right) - \sum_{p \leq t} \frac{p^{-i\omega t}}{p^{\frac12 + \varepsilon_t + i f_t^{(j)}}} \]

is bounded in $L^2$, and after normalization by $\frac{1}{\sqrt{-\log\varepsilon_t}}$ or $\frac{1}{\sqrt{\log\log t}}$, it converges
in probability to 0. Hence, Slutsky's lemma and the Cramér-Wold device
allow us to reformulate Theorem 1.1 in the following way.

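Equivalent of Theorem 1.1. Let $\omega$ be uniform on $(0,1)$, $\varepsilon_t \to 0$, $\varepsilon_t \gg 1/\log t$,
and functions $0 \leq f_t^{(1)} < \dots < f_t^{(\ell)} < c < \infty$. Suppose (1.4). Then for any
finite set of complex numbers $\mu_1, \dots, \mu_\ell$,

\[ \frac{1}{\sqrt{-\log\varepsilon_t}} \sum_{j=1}^{\ell} \mu_j \sum_{p \leq t} \frac{p^{-i\omega t}}{p^{\frac12 + \varepsilon_t + i f_t^{(j)}}} \tag{3.1} \]

converges in law to a complex Gaussian variable with mean 0 and variance

\[ \sigma^2 = \sum_{j=1}^{\ell} |\mu_j|^2 + \sum_{j \neq k} \overline{\mu_j}\mu_k (1 \wedge c_{j,k}). \]

If $\varepsilon_t \ll 1/\log t$, then the same result holds with normalization $1/\sqrt{\log\log t}$
instead of $1/\sqrt{-\log\varepsilon_t}$ in (3.1) and (1.4).

To prove this convergence in law, we need a number-theoretic analogue
of Theorem 2.1, stated in the next paragraph.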

3.2. An analogue of the Diaconis-Evans theorem. Heuristically, the
following proposition stems from the linear independence of the $\log p$'s over
$\mathbb{Q}$, and the main tool to prove it is the Montgomery-Vaughan theorem.

Note that, generally, convergence to normal variables in a number-theoretic
context is proved thanks to the convergence of all moments (see e.g. [8]).
The result below is a tool showing that testing the $L^2$-convergence is
sufficient.

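Proposition 3.1. Let $a_{pt}$ ($p \in \mathcal{P}$, $t \in \mathbb{R}^+$) be complex numbers with
$\sup_p |a_{pt}| \to 0$ and $\sum_p |a_{pt}|^2 \to \sigma^2$ as $t \to \infty$. Suppose also the existence of $(m_t)$
with $\log m_t / \log t \to 0$ and

\[ \sum_{p > m_t} |a_{pt}|^2 \left( 1 + \frac{p}{t} \right) \xrightarrow[t\to\infty]{} 0. \tag{3.2} \]

Then, if $\omega$ is a uniform random variable on $(0,1)$,

\[ \sum_{p \in \mathcal{P}} a_{pt}\, p^{-i\omega t} \xrightarrow{\text{law}} \sigma Y \]

as $t \to \infty$, $Y$ being a standard complex normal variable.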

Remark. The condition $m_n = o(n)$ in Theorem 2.1 is replaced here by
$\log m_t = o(\log t)$. A systematic substitution $n \leftrightarrow \log t$ would give the
stronger condition $m_t / \log m_t = o(\log t)$: the above proposition gives a better
result than the one expected from the analogy between random matrices
and number theory.

Proof. Condition (3.2) first allows us to restrict the infinite sum over the set of
primes $\mathcal{P}$ to the finite sum over $\mathcal{P} \cap [2, m_t]$. More precisely, following [10],
let $(a_r)$ be complex numbers, $(\lambda_r)$ distinct real numbers and

\[ \delta_r = \min_{s \neq r} |\lambda_r - \lambda_s|. \]

The Montgomery-Vaughan theorem states that

\[ \frac{1}{t} \int_0^t \left| \sum_r a_r e^{i\lambda_r s} \right|^2 ds = \sum_r |a_r|^2 \left( 1 + \frac{3\pi\theta}{t\,\delta_r} \right) \]

for some $\theta$ with $|\theta| \leq 1$. We substitute above $a_r$ by $a_{pt}$ and $\lambda_r$ by $\log p$,
and restrict the sum to the $p$'s greater than $m_t$: there is a constant $c > 0$
independent of $p$ with $\min_{p' \neq p} |\log p - \log p'| > \frac{c}{p}$, so

\[ \frac{1}{t} \int_0^t \left| \sum_{p > m_t} a_{pt}\, p^{-is} \right|^2 ds \leq \sum_{p > m_t} |a_{pt}|^2 \left( 1 + c' \frac{p}{t} \right) \]

with $c'$ bounded by $3\pi/c$. Hence the hypothesis (3.2) implies that
$\sum_{p > m_t} a_{pt}\, p^{-i\omega t}$ converges to 0 in $L^2$, so by Slutsky's lemma it is sufficient
to show that

\[ \sum_{p \leq m_t} a_{pt}\, p^{-i\omega t} \xrightarrow{\text{law}} \sigma Y. \tag{3.3} \]

As $\sum_{p \leq m_t} |a_{pt}|^2 \to \sigma^2$ and $\sup_{p \leq m_t} |a_{pt}| \to 0$, Theorem 4.1 in Petrov [11]
gives the following central limit theorem:

\[ \sum_{p \leq m_t} a_{pt}\, e^{i\omega_p} \xrightarrow{\text{law}} \sigma Y, \tag{3.4} \]

where the $\omega_p$'s are independent uniform random variables on $(0, 2\pi)$. The
$\log p$'s being linearly independent over $\mathbb{Q}$, it is well known that as $t \to \infty$
any given finite number of the $p^{i\omega t}$'s are asymptotically independent and
uniform on the unit circle. The problem here is that the number of these
random variables increases as they become independent. If this number
increases sufficiently slowly ($\log m_t / \log t \to 0$), one can expect that (3.4)
implies (3.3).

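The method of moments tells us that, in order to prove the central limit
theorem (3.3), it is sufficient to show for all positive integers $a$ and $b$ that

\[ \mathbb{E}\left( f_{a,b}\left( \sum_{p \leq m_t} a_{pt}\, p^{-i\omega t} \right) \right) \xrightarrow[t\to\infty]{} \mathbb{E}\left( f_{a,b}(\sigma Y) \right), \]

with $f_{a,b}(x) = x^a \overline{x}^b$. From (3.4) we know that

\[ \mathbb{E}\left( f_{a,b}\left( \sum_{p \leq m_t} a_{pt}\, e^{i\omega_p} \right) \right) \xrightarrow[t\to\infty]{} \mathbb{E}\left( f_{a,b}(\sigma Y) \right). \]

Hence it is sufficient for us to show that, for every $a$ and $b$,

\[ \left| \mathbb{E}\left( f_{a,b}\left( \sum_{p \leq m_t} a_{pt}\, p^{-i\omega t} \right) \right) - \mathbb{E}\left( f_{a,b}\left( \sum_{p \leq m_t} a_{pt}\, e^{i\omega_p} \right) \right) \right| \xrightarrow[t\to\infty]{} 0. \tag{3.5} \]

Let $n_t = |\mathcal{P} \cap [2, m_t]|$ and, for $z = (z_1, \dots, z_{n_t}) \in \mathbb{R}^{n_t}$, write
$f^{(t)}_{a,b}(z) = f_{a,b}\left( \sum_{p \leq m_t} a_{pt}\, e^{i z_p} \right)$, which is $\mathscr{C}^\infty$ and $(2\pi\mathbb{Z})^{n_t}$-periodic. Let its Fourier
decomposition be $f^{(t)}_{a,b}(z) = \sum_{k \in \mathbb{Z}^{n_t}} u^{(t)}_{a,b}(k)\, e^{i k \cdot z}$. If we write $T_s$ for the
translation on $\mathbb{R}^{n_t}$ with vector $s\,p^{(t)} = s(\log p_1, \dots, \log p_{n_t})$, inspired by the proof
of Theorem 2.1 we can write the LHS of the above equation as ($\mu^{(t)}$ is the
uniform distribution on the torus with dimension $n_t$)

\[ \left| \frac{1}{t} \int_0^t f^{(t)}_{a,b}(T_{-s} 0) \, ds - \int f^{(t)}_{a,b} \, d\mu^{(t)} \right| = \left| \sum_{k \neq 0} u^{(t)}_{a,b}(k)\, \frac{1 - e^{-i t\, k \cdot p^{(t)}}}{i t\, k \cdot p^{(t)}} \right|. \]

Our theorem will be proven if the above difference between a mean in time
and a mean in space converges to 0, which can be seen as an ergodic result.
The above RHS is clearly bounded by

\[ \frac{2}{t} \, \frac{1}{\inf_{k \in H^{(t)}_{a,b}} |k \cdot p^{(t)}|} \sum_{k \in \mathbb{Z}^{n_t}} \left| u^{(t)}_{a,b}(k) \right|, \]

where $H^{(t)}_{a,b}$ is the set of the non-zero $k$'s in $\mathbb{Z}^{n_t}$ for which $u^{(t)}_{a,b}(k) \neq 0$:
such a $k$ can be written $k^{(1)} - k^{(2)}$, with $k^{(1)} \in [[0,a]]^{n_t}$, $k^{(2)} \in [[0,b]]^{n_t}$,
$k^{(1)}_1 + \dots + k^{(1)}_{n_t} = a$, $k^{(2)}_1 + \dots + k^{(2)}_{n_t} = b$.

First note that, as $\sum_{k \in \mathbb{Z}^{n_t}} u^{(t)}_{a,b}(k)\, e^{i k \cdot z} = \left( \sum_{p \leq m_t} a_{pt}\, e^{i z_p} \right)^a \left( \sum_{p \leq m_t} \overline{a_{pt}}\, e^{-i z_p} \right)^b$,
we have $\sum_k |u^{(t)}_{a,b}(k)| \leq \left( \sum_{p \leq m_t} |a_{pt}| \right)^{a+b}$; by the Cauchy-Schwarz inequality and
the convergence $\sum_p |a_{pt}|^2 \to \sigma^2$, this gives for sufficiently large $t$

\[ \frac{2}{t} \, \frac{\sum_{k} |u^{(t)}_{a,b}(k)|}{\inf_{k \in H^{(t)}_{a,b}} |k \cdot p^{(t)}|} \leq \frac{2}{t} \, \frac{\left( n_t \sum_{p \leq m_t} |a_{pt}|^2 \right)^{\frac{a+b}{2}}}{\inf_{k \in H^{(t)}_{a,b}} |k \cdot p^{(t)}|} \leq \frac{2}{t} \, \frac{\left( (\sigma^2 + 1)\, n_t \right)^{\frac{a+b}{2}}}{\inf_{k \in H^{(t)}_{a,b}} |k \cdot p^{(t)}|}. \]

Lemma 3.2 below and the condition $\log m_t / \log t \to 0$ show that the above
term tends to 0, concluding the proof. □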

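Lemma 3.2. For $n_t > 1$ and all $k \in H^{(t)}_{a,b}$,

\[ |k \cdot p^{(t)}| \geq \frac{1}{n_t^{2\max(a,b)}}. \]

Proof. For $k \in \mathbb{Z}^{n_t}$, $k \neq 0$, let $\mathcal{E}_1$ (resp. $\mathcal{E}_2$) be the set of indexes $i \in [[1, n_t]]$
with $k_i$ strictly positive (resp. strictly negative). Write $u_1 = \prod_{i \in \mathcal{E}_1} p_i^{|k_i|}$
and $u_2 = \prod_{i \in \mathcal{E}_2} p_i^{|k_i|}$. Suppose $u_1 \geq u_2$. Thanks to the uniqueness of
decomposition as product of primes, $u_1 \geq u_2 + 1$. Hence, by concavity of the logarithm,

\[ |k \cdot p^{(t)}| = \log u_1 - \log u_2 \geq (\log' u_1)(u_1 - u_2) = \frac{u_1 - u_2}{u_1} \geq \frac{1}{u_1} = e^{-\sum_{i \in \mathcal{E}_1} k_i \log p_i} \geq e^{-(\log p_{n_t}) \sum_{i \in \mathcal{E}_1} k_i}. \]

For all $n_t > 1$, $\log p_{n_t} \leq 2\log n_t$. Moreover, from the decomposition
$k = k^{(1)} - k^{(2)}$ in the previous section, we know that $\sum_{i \in \mathcal{E}_1} k_i \leq a$, so
$|k \cdot p^{(t)}| \geq e^{-2a\log n_t}$.
The case $u_1 < u_2$ leads to $|k \cdot p^{(t)}| \geq e^{-2b\log n_t}$, which completes the proof. □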
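
The lower bound of Lemma 3.2 can be tested by brute force for small parameters: enumerate all $k = k^{(1)} - k^{(2)}$ with non-negative integer vectors $k^{(1)}$, $k^{(2)}$ of coordinate sums $a$ and $b$, and compute the smallest non-zero value of $|k \cdot p^{(t)}|$ over the first $n_t$ primes. The Python sketch below does this under the assumption $n_t \geq 2$; all function names are hypothetical.

```python
import itertools
import math


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    return [
        combo
        for combo in itertools.product(range(total + 1), repeat=parts)
        if sum(combo) == total
    ]


def min_nonzero_dot(n_t, a, b):
    """Smallest |k . (log p_1, ..., log p_{n_t})| over non-zero k = k1 - k2."""
    logs = [math.log(p) for p in first_primes(n_t)]
    best = math.inf
    for k1 in compositions(a, n_t):
        for k2 in compositions(b, n_t):
            k = [x - y for x, y in zip(k1, k2)]
            if any(k):
                best = min(best, abs(sum(ki * li for ki, li in zip(k, logs))))
    return best


def lemma_bound(n_t, a, b):
    """Lower bound n_t^(-2 max(a, b)) from Lemma 3.2."""
    return n_t ** (-2 * max(a, b))
```

With the primes 2, 3, 5, 7 and $a = b = 2$, the minimum is attained at $\log(15/14)$, well above the bound $4^{-4}$; the lemma is far from sharp, but it is enough for the proof since only polynomial dependence on $n_t$ is needed.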

In the above proof, we showed that the remainder terms ($p > m_t$) converge
to 0 in the $L^2$-norm to simplify a problem of convergence of a sum over primes:
this method seems to appear for the first time in Soundararajan [13].

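3.3. Proof of Theorem 1.1 for $\varepsilon_t \gg 1/\log t$. To prove our equivalent
of Theorem 1.1, we apply the above Proposition 3.1 to the random variable
(3.1), that is to say

\[ a_{pt} = \frac{1}{\sqrt{-\log\varepsilon_t}} \sum_{j=1}^{\ell} \frac{\mu_j}{p^{\frac12 + \varepsilon_t + i f_t^{(j)}}} \]

if $p \leq t$, 0 if $p > t$. Then clearly $\sup_p |a_{pt}| \to 0$ as $t \to \infty$. For any sequence
$0 < m_t < t$, writing $a = \left(\sum_{k=1}^{\ell} |\mu_k|\right)^2$,

\[ \sum_{m_t < p \leq t} |a_{pt}|^2 \left( 1 + \frac{p}{t} \right) \leq \frac{a}{-\log\varepsilon_t} \sum_{m_t < p \leq t} \frac{1}{p^{1 + 2\varepsilon_t}} \left( 1 + \frac{p}{t} \right) \leq \frac{2a}{-\log\varepsilon_t} \sum_{m_t < p \leq t} \frac{1}{p}. \]

As $\sum_{p \leq t} \frac{1}{p} \sim \log\log t$, condition (3.2) is satisfied if we can find
$m_t = \exp(\log t / b_t)$ with $b_t \to \infty$ and $\frac{\log b_t}{-\log\varepsilon_t} \to 0$: $b_t = -\log\varepsilon_t$ for example.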
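
The asymptotic $\sum_{p \leq t} 1/p \sim \log\log t$ used here is Mertens' theorem, whose sharper form is $\sum_{p \leq t} 1/p = \log\log t + M + o(1)$ with the Meissel-Mertens constant $M \approx 0.2614972$. The Python sketch below checks this numerically with a basic sieve of Eratosthenes; the function names and the constant `MEISSEL_MERTENS` are introduced only for this illustration.

```python
import math

# Meissel-Mertens constant, truncated to the stated digits.
MEISSEL_MERTENS = 0.2614972


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def prime_reciprocal_sum(limit):
    """Sum of 1/p over primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))


def mertens_estimate(limit):
    """log log(limit) + M, the Mertens approximation of the sum above."""
    return math.log(math.log(limit)) + MEISSEL_MERTENS
```

The difference between sums up to $t$ and up to $m_t = \exp(\log t / b_t)$ is then close to $\log\log t - \log\log m_t = \log b_t$, which is the quantity compared with $-\log\varepsilon_t$ in the argument above.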

We now only need to show that $\sum_{p \leq t} |a_{pt}|^2 \to \sum_{j=1}^{\ell} |\mu_j|^2 + \sum_{s \neq t} \overline{\mu_s}\mu_t (1 \wedge c_{s,t})$,
which is a consequence of the following lemma.

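Lemma 3.3. Let $(\Delta_t)$ be bounded and positive. If $\varepsilon_t \to 0$, $\varepsilon_t \gg 1/\log t$ and
$\log\Delta_t / \log\varepsilon_t \to c \in [0,\infty]$, then

\[ \frac{1}{-\log\varepsilon_t} \sum_{p \leq t} \frac{p^{i\Delta_t}}{p^{1 + 2\varepsilon_t}} \xrightarrow[t\to\infty]{} c \wedge 1. \]

Proof. The first step consists in showing that $\frac{1}{-\log\varepsilon_t} \sum_{p \leq t} \frac{p^{i\Delta_t}}{p^{1+2\varepsilon_t}}$ has the
same limit as the infinite sum $\frac{1}{-\log\varepsilon_t} \sum_{p \in \mathcal{P}} \frac{p^{i\Delta_t}}{p^{1+2\varepsilon_t}}$. In fact, a stronger result
holds: as $\varepsilon_t$ is sufficiently large ($\varepsilon_t > d/\log t$ for some $d > 0$), $\sum_{p > t} \frac{1}{p^{1+2\varepsilon_t}}$
is uniformly bounded:

\[ \sum_{p > t} \frac{1}{p^{1+2\varepsilon_t}} = \sum_{n > t} \frac{\pi(n) - \pi(n-1)}{n^{1+2\varepsilon_t}} = \sum_{n > t} \pi(n) \left( \frac{1}{n^{1+2\varepsilon_t}} - \frac{1}{(n+1)^{1+2\varepsilon_t}} \right) + o(1) = (1 + 2\varepsilon_t) \int_t^{\infty} \frac{\pi(x)}{x^{2+2\varepsilon_t}} \, dx + o(1), \]

and this last term is bounded, for sufficiently large $t$ (remember that
$\pi(x) \sim x/\log x$ from the prime number theorem), by

\[ 2 \int_t^{\infty} \frac{dx}{x^{1 + \frac{2d}{\log t}} \log x} = -2 \int_0^{e^{-2d}} \frac{dy}{\log y} < \infty, \]

as shown by the change of variables $y = x^{-2d/\log t}$. Therefore the lemma is
equivalent to

\[ \frac{1}{-\log\varepsilon_t} \sum_{p \in \mathcal{P}} \frac{p^{i\Delta_t}}{p^{1+2\varepsilon_t}} \xrightarrow[t\to\infty]{} c \wedge 1. \]

The above term has the same limit as

\[ \frac{1}{\log\varepsilon_t} \sum_{p \in \mathcal{P}} \log\left( 1 - \frac{p^{i\Delta_t}}{p^{1+2\varepsilon_t}} \right) = \frac{\log\zeta(1 + 2\varepsilon_t - i\Delta_t)}{-\log\varepsilon_t} \]

because $\log(1 - x) = -x + O(|x|^2)$ as $x \to 0$, and $\sum_p 1/p^2 < \infty$. The
equivalent $\zeta(1 + x) \sim 1/x$ ($x \to 0$) and the condition $\log\Delta_t / \log\varepsilon_t \to c$ yield
the conclusion, exactly as in the end of the proof of Lemma 2.2. □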

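3.4. Proof of Theorem 1.1 for $\varepsilon_t \ll 1/\log t$. The equivalent of Theorem
1.1 now needs to be proven with

\[ a_{pt} = \frac{1}{\sqrt{\log\log t}} \sum_{j=1}^{\ell} \frac{\mu_j}{p^{\frac12 + \varepsilon_t + i f_t^{(j)}}} \]

if $p \leq t$, 0 if $p > t$. Reasoning as in the previous paragraph, a suitable
choice for $(m_t)$ is $m_t = \exp(\log t / \log\log t)$. Therefore, the only remaining
condition to check is that, for $(\Delta_t)$ bounded and strictly positive such that
$-\log\Delta_t / \log\log t \to c$ and $\varepsilon_t \ll 1/\log t$,

\[ \frac{1}{\log\log t} \sum_{p \leq t} \frac{p^{i\Delta_t}}{p^{1 + 2\varepsilon_t}} \xrightarrow[t\to\infty]{} c \wedge 1. \]

First note that we can suppose $\varepsilon_t = 0$, because (using $\varepsilon_t < d/\log t$ for some
$d > 0$ and once again $|1 - e^{-x}| \leq x$ for $x \geq 0$)

\[ \left| \sum_{p \leq t} \frac{p^{i\Delta_t}}{p^{1 + 2\varepsilon_t}} - \sum_{p \leq t} \frac{p^{i\Delta_t}}{p} \right| \leq \sum_{p \leq t} \frac{2\varepsilon_t \log p}{p} \leq \frac{2d}{\log t} \sum_{p \leq t} \frac{\log p}{p} \xrightarrow[t\to\infty]{} 2d, \]

where the last limit makes use of the prime number theorem. The result
therefore follows from the lemma below, a strict analogue of Lemma 2.3 used
in the context of random matrices.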

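Lemma 3.4. Let $(\Delta_t)$ be bounded and positive, such that
$-\log\Delta_t / \log\log t \to c \in [0,\infty]$. Then

\[ \frac{1}{\log\log t} \sum_{p \leq t} \frac{p^{i\Delta_t}}{p} \xrightarrow[t\to\infty]{} c \wedge 1. \]

Proof. As calculated in the proof of Lemma 3.3,

\[ \sum_{p \leq t} \frac{p^{i\Delta_t}}{p} = \sum_{n \leq t} n^{i\Delta_t - 1} \left( \pi(n) - \pi(n-1) \right) = (1 - i\Delta_t) \int_2^t \frac{\pi(x)\, x^{i\Delta_t}}{x^2} \, dx + o(1). \]

The prime number theorem ($\pi(x) \sim x/\log x$) thus implies

\[ \sum_{p \leq t} \frac{p^{i\Delta_t}}{p} = (1 - i\Delta_t) \int_e^t \frac{x^{i\Delta_t}}{x\log x} \, dx + (1 - i\Delta_t)\, o\left( \int_e^t \frac{dx}{x\log x} \right) + o(1) \]
\[ = (1 - i\Delta_t) \int_{\Delta_t}^{\Delta_t\log t} \frac{e^{iy}}{y} \, dy + (1 - i\Delta_t)\, o(\log\log t) + o(1). \]

If $c > 1$, $\Delta_t \log t \to 0$, so the above term is equivalent to
$\int_{\Delta_t}^{\Delta_t\log t} dy/y = \log\log t$. If $c < 1$, $\Delta_t \log t \to \infty$ so, as
$\sup_{x > 1} \left| \int_1^x \frac{e^{iy}}{y} \, dy \right| < \infty$, $\frac{1}{\log\log t} \sum_{p \leq t} \frac{p^{i\Delta_t}}{p}$
tends to the same limit as $\frac{1}{\log\log t} \int_{\Delta_t}^{1} dy/y = -\log\Delta_t / \log\log t \to c$. Finally, if $c = 1$,
the distinction between the cases $\Delta_t \log t > 1$ and $\Delta_t \log t \leq 1$ and the above
reasoning give 1 in the limit. □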

4. Connection with spatial branching processes. 

There is no easy a priori reason why the matrix (1.6) is a covariance
matrix. More precisely, given positive numbers $c_1, \dots, c_{\ell-1}$, is there a reason
why the symmetric matrix

\[ C_{i,j} = \begin{cases} 1 & \text{if } i = j \\ 1 \wedge \inf_{k \in [[i, j-1]]} c_k & \text{if } i < j \end{cases} \tag{4.1} \]

is positive semi-definite? This is a by-product of Theorem 1.1, and a possible
construction for the Gaussian vector $(Y_1, \dots, Y_\ell)$ is as follows. Define the
angles $\varphi_n^{(k)}$, $1 \leq k \leq \ell$, by $\varphi_n^{(1)} = 0$ and

\[ \varphi_n^{(k)} = \varphi_n^{(k-1)} + \frac{1}{n^{c_{k-1}}}, \quad 2 \leq k \leq \ell. \]

Let $(X_r)_{r \geq 1}$ be independent standard complex Gaussian variables. For
$1 \leq k \leq \ell$, let

\[ Y_n^{(k)} = \frac{1}{\sqrt{\log n}} \sum_{r=1}^{n} e^{i r \varphi_n^{(k)}} \frac{X_r}{\sqrt{r}}. \]

Then $(Y_n^{(1)}, \dots, Y_n^{(\ell)})$ is a complex Gaussian vector, and Lemma 2.3 implies
that its covariance matrix converges to (4.1).

Instead of finding a Gaussian vector with covariance structure (4.1), we
consider this problem: given $c_1, \dots, c_\ell$ positive real numbers, can we find a
centered (real or complex) Gaussian vector $(X_1, \dots, X_\ell)$ with

\[ \mathbb{E}(X_i X_j) = \inf_{i \leq k \leq j} c_k \tag{4.2} \]

for all $i \leq j$? A matrix $C$ of type (4.1) can always be obtained as $\lambda C' + D$
with $\lambda > 0$, $C'$ of type (4.2) and $D$ diagonal with positive entries, so the
above problem is more general than the original one.

Equation (4.2) is the discrete analogue of the following problem, considered
in the context of spatial branching processes by Le Gall (see e.g. [5]).
Strictly following his work, we denote by $e : [0,\sigma] \to \mathbb{R}^+$ a continuous function
such that $e(0) = e(\sigma) = 0$. Le Gall associates to such a function $e$ a continuous
tree by the following construction: each $s \in [0,\sigma]$ corresponds to a
vertex of the tree after identification of $s$ and $t$ ($s \sim t$) if

\[ e(s) = e(t) = \inf_{[s,t]} e(r). \]

This set $[0,\sigma]/\sim$ of vertices is endowed with the partial order $s \prec t$ ($s$ is an
ancestor of $t$) if

\[ e(s) = \inf_{[s,t]} e(r). \]

Independent Brownian motions can diffuse on the distinct branches of the
tree: this defines a Gaussian process $B_u$ with $u \in [0,\sigma]/\sim$ (see [5] for the
construction of this diffusion). For $s \in [0,\sigma]$ writing $X_s = B_{\bar{s}}$ (where $\bar{s}$ is the
equivalence class of $s$ for $\sim$), we get a continuous centered Gaussian process
on $[0,\sigma]$ with correlation structure

\[ \mathbb{E}(X_s X_t) = \inf_{[s,t]} e(u), \tag{4.3} \]

which is the continuous analogue of (4.2). This construction by Le Gall
yields a solution of our discrete problem (4.2). More precisely, suppose for
simplicity that all the $c_i$'s are distinct (this is not a restrictive hypothesis
by a continuity argument), and consider the graph $i \mapsto c_i$. We say that $i$ is
an ancestor of $j$ if

\[ c_i = \inf_{k \in [[i,j]]} c_k. \]

The father of $i$ is its nearest ancestor, for the distance $d(i,j) = |c_i - c_j|$.
It is denoted by $p(i)$. We can write $c_{\sigma(1)} < \dots < c_{\sigma(\ell)}$ for some
permutation $\sigma$, and $(\mathcal{N}_1, \dots, \mathcal{N}_\ell)$ a vector of independent centered
complex Gaussian variables, $\mathcal{N}_k$ with variance $c_k - c_{p(k)}$ (by convention
$c_{p(\sigma(1))} = 0$). Then the Gaussian vector $(X_1, \dots, X_\ell)$ iteratively
defined by

\[ X_{\sigma(1)} = \mathcal{N}_{\sigma(1)}, \qquad X_{\sigma(i+1)} = X_{p(\sigma(i+1))} + \mathcal{N}_{\sigma(i+1)} \]

satisfies (4.2), by construction.
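
The discrete tree construction can be checked mechanically: each $X_i$ is the sum of the independent variables $\mathcal{N}_k$ along the lineage of $i$ (itself, its father, and so on up to the root), so the covariance of $X_i$ and $X_j$ is the total variance carried by their common ancestors. The Python sketch below computes this covariance and compares it with $\inf_{i \leq k \leq j} c_k$. It assumes distinct $c_i$'s and 0-based indices, and every function name is a hypothetical helper for this illustration.

```python
def father(c, i):
    """Index of the father of i, or None when i is the root.

    j is an ancestor of i when c[j] is the minimum of c between i and j;
    the father is the ancestor with the closest (hence largest) value c[j].
    """
    ancestors = [
        j for j in range(len(c))
        if j != i and c[j] == min(c[min(i, j):max(i, j) + 1])
    ]
    return max(ancestors, key=lambda j: c[j]) if ancestors else None


def lineage(c, i):
    """i followed by its father, grandfather, ... up to the root."""
    line = [i]
    while True:
        f = father(c, line[-1])
        if f is None:
            return line
        line.append(f)


def tree_covariance(c):
    """Covariance matrix of X_i = sum of N_k over the lineage of i.

    N_k has variance c[k] - c[father(k)], with the convention that the
    father of the root contributes 0.
    """
    n = len(c)
    variances = []
    for k in range(n):
        f = father(c, k)
        variances.append(c[k] - (c[f] if f is not None else 0))
    lines = [set(lineage(c, i)) for i in range(n)]
    return [
        [sum(variances[k] for k in lines[i] & lines[j]) for j in range(n)]
        for i in range(n)
    ]


def target_covariance(c):
    """Matrix with entries min of c[k] for k between i and j, as in (4.2)."""
    n = len(c)
    return [
        [min(c[min(i, j):max(i, j) + 1]) for j in range(n)]
        for i in range(n)
    ]
```

For example, with `c = [3, 1, 4, 2, 5]` the root is the index of the value 1, the father of the index carrying 4 is the index carrying 2, and the two matrices coincide entry by entry.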